An algorithm for estimating signal-to-noise ratio and combiner weight
parameters for a discrete time series is presented. The algorithm is based
upon the joint maximum likelihood estimate of the signal and noise power. The
discrete time series are the sufficient statistics obtained after matched
filtering of a biphase modulated signal in additive white gaussian noise,
before maximum likelihood decoding is performed.


I. Introduction and Problem Model 

This article investigates maximum likelihood estimation of signal-to-noise
ratio and combiner weight parameters for a discrete time series. The discrete
time series are the sufficient statistics obtained after matched filtering of
a biphase modulated signal (Ref. 1). In order to show the underlying
assumptions and limitations of the estimation problem, we first examine the
communication system that gives rise to the discrete time series.

We take as our model that given in Fig. 1. The channel encoder maps the
binary digital source encoder output $\{I_k\}$ into the binary channel symbols
$\{c_k\}$, where the channel symbols are produced with rate $1/T$. The
modulation is biphase. That is, the modulator produces the baseband signal

$$ s(t) = \sum_k A_k \, q_k(t) \tag{1} $$

where the $\{A_k\}$ are chosen according to

$$ A_k = \begin{cases} -\sqrt{E_s}, & c_k = \text{“0”} \\ +\sqrt{E_s}, & c_k = \text{“1”} \end{cases} \tag{2} $$

Here, $E_s$ is the channel symbol energy, and the $\{q_k(t)\}$ are
orthonormal basis functions. We assume that the $\{q_k(t)\}$ are
time-displaced replicas of a single function of duration $T$, namely,

$$ q_k(t) = q(t - (k-1)T) \tag{3} $$

where

$$ q(t) = 0, \quad t < 0 \ \text{or} \ t > T \tag{4} $$

$$ \int_0^T q^2(t) \, dt = 1 \tag{5} $$


The baseband signal $s(t)$ is transmitted over an additive white gaussian
noise channel with one-sided noise spectral density $N_0$. The received
baseband signal is represented by $x(t)$ in Fig. 1. This received signal is
demodulated by matched filtering (integrate and dump) to produce a discrete
time series $\{x_k\}$:

$$ x_k = \int_{(k-1)T}^{kT} x(t) \, q(t - (k-1)T) \, dt \tag{6} $$


Assuming such ideal channel and receiver characteristics as perfect phase
tracking and channel symbol synchronization, no channel symbol interference,
etc., the output time series from the demodulator are the sufficient
statistics for maximum likelihood decoding. Referring to Eqs. (1)-(6), we see
that this time series can be written in the form

$$ x_k = m_{true} \, a_k + \sigma_{true} \, b_k \tag{7} $$


where the $\{a_k\}$ are either +1 or -1 depending upon whether the channel
symbol transmitted was a “1” or “0,” $\{b_k\}$ are independent and
identically distributed gaussian random variables with zero mean and unit
variance, and $m_{true}$ and $\sigma_{true}$ are given by

$$ m_{true} = \sqrt{E_s} \tag{8} $$

$$ \sigma_{true} = \sqrt{N_0 / 2} \tag{9} $$

The parameters $m_{true}$ and $\sigma_{true}$ represent the true values of
the signal and noise amplitudes, respectively. We note that $\sigma_{true}$
and $m_{true}$ are by definition non-negative.

In order to make the problem mathematically tractable, we make the
assumption that the $\{a_k\}$ are independent and take on the values +1 and
-1 with equal probability. For a communication system employing coding, this
assumption is not correct. Thus, the effect of coding on the estimation
algorithm given here needs to be determined.

II. Parameters to be Estimated

Our starting point for all further analysis is the time series $\{x_k\}$
defined in Eq. (7), with the assumed probabilistic models for the sequences
of random variables $\{a_k\}$ and $\{b_k\}$. Our objective is to find maximum
likelihood estimates of the signal and noise parameters $m_{true}$ and
$\sigma_{true}$ or of other parameters of interest that are embedded in the
model. Two such parameters are signal-to-noise ratio and combiner weight.

The signal-to-noise ratio (SNR) at the receiver is defined as

$$ SNR = E_s / N_0 \tag{10} $$

SNR is a fundamental parameter of interest for a variety of reasons. For
example, SNR is needed to optimally choose the quantization levels of the
demodulator so that the “best” discrete channel is provided to the channel
encoder-decoder (Ref. 2). We find it convenient to define a signal-to-noise
ratio parameter $\rho_{true}$ for the demodulated time series as

$$ \rho_{true} = m_{true}^2 / \sigma_{true}^2 \tag{11} $$

In terms of $\rho_{true}$, the SNR at the receiver is simply

$$ SNR = \rho_{true} / 2 \tag{12} $$

Another quantity of interest is the combiner weight needed for symbol stream
combining. For example, suppose $L$ different time series (or symbol streams)
are available from $L$ different receiver-demodulators,

$$ x_{ik} = m_i \, a_k + \sigma_i \, b_{ik}, \quad i = 1, \ldots, L \tag{13} $$

where, as before, $\{a_k\}$ are either +1 or -1, and $\{b_{ik}\}$ are
independent, identically distributed gaussian random variables with zero mean
and unit variance. It can be shown (Ref. 3) that maximum likelihood decoding
of the $L$ time series $\{x_{ik}\}$ is equivalent to maximum likelihood
decoding of a single time series $\{y_k\}$, where

$$ y_k = \sum_{i=1}^{L} \alpha_i \, x_{ik} \tag{14} $$

and the combiner weights $\{\alpha_i\}$ are chosen to be proportional to
$\{m_i / \sigma_i^2\}$. Thus, we are interested in estimating for any given
time series a combiner weight parameter defined by

$$ \alpha_{true} = m_{true} / \sigma_{true}^2 \tag{15} $$
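The combining rule of Eqs. (13)-(15) can be illustrated with a minimal
sketch. The function names, the two-stream parameter values, and the random
seed below are hypothetical choices made only for illustration; they do not
appear in the original analysis. The sketch also checks the standard property
that, with weights proportional to $m_i / \sigma_i^2$, the signal-to-noise
ratio parameter of the combined series is the sum of the individual
parameters $m_i^2 / \sigma_i^2$.

```python
import random


def simulate_streams(symbols, m, sigma, rng):
    """Generate x_ik = m_i * a_k + sigma_i * b_ik with b_ik ~ N(0, 1), Eq. (13)."""
    return [[mi * a + si * rng.gauss(0.0, 1.0) for a in symbols]
            for mi, si in zip(m, sigma)]


def combine_streams(streams, m, sigma):
    """Form y_k = sum_i alpha_i * x_ik with alpha_i = m_i / sigma_i**2, Eqs. (14)-(15)."""
    weights = [mi / si ** 2 for mi, si in zip(m, sigma)]
    n = len(streams[0])
    y = [sum(w * s[k] for w, s in zip(weights, streams)) for k in range(n)]
    return y, weights


def combined_rho(m, sigma, weights):
    """Signal-to-noise ratio parameter (mean squared over variance) of y_k."""
    mean = sum(w * mi for w, mi in zip(weights, m))
    var = sum((w * si) ** 2 for w, si in zip(weights, sigma))
    return mean * mean / var


# Hypothetical example: two streams with different signal amplitudes.
rng = random.Random(0)
symbols = [rng.choice((-1, 1)) for _ in range(2000)]
m = [1.0, 0.5]
sigma = [1.0, 1.0]
streams = simulate_streams(symbols, m, sigma, rng)
y, weights = combine_streams(streams, m, sigma)
rho_combined = combined_rho(m, sigma, weights)
rho_sum = sum(mi ** 2 / si ** 2 for mi, si in zip(m, sigma))
```

In practice the true $m_i$ and $\sigma_i$ are unknown, which is why an
estimate of the combiner weight parameter is needed for each stream.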

In different applications, we may desire to estimate one, two, or several
parameters simultaneously. However, we should always be aware that our
assumed problem model has exactly two independent unknown parameters. This
implies that any estimate of a single parameter (such as SNR) must be aided
by an implicit estimate of an independent auxiliary parameter, and that
simultaneous estimates of more than two parameters are not all independent.
In particular, maximum likelihood estimation as applied to our problem must
produce a joint maximum likelihood estimate of a pair of independent
parameters.

Fortunately, it is not necessary to re-solve the maximum likelihood equations
for every combination of parameters of interest. If two pairs of parameters
are related by a one-to-one transformation, then the corresponding joint
maximum likelihood estimates are related by the same transformation (Ref. 4).
Thus, we propose finding the joint maximum likelihood estimate of the signal
and noise parameters $m_{true}$ and $\sigma_{true}$, which we denote as
$\hat{m}$ and $\hat{\sigma}$, respectively. Then (ignoring a singularity at
$\sigma_{true} = 0$ or $\hat{\sigma} = 0$) we can define the corresponding
maximum likelihood estimates of the signal-to-noise ratio parameter
$\rho_{true}$ and the combiner weight parameter $\alpha_{true}$:

$$ \hat{\rho} = \hat{m}^2 / \hat{\sigma}^2 \tag{16} $$

$$ \hat{\alpha} = \hat{m} / \hat{\sigma}^2 \tag{17} $$
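As a small illustration of the transformation in Eqs. (16) and (17), the
sketch below maps a pair of amplitude estimates to the derived parameters.
The function name is hypothetical, and the conversion to receiver SNR assumes
the relation SNR = $\rho/2$ of Eq. (12). The singular case
$\hat{\sigma} = 0$ is rejected rather than handled.

```python
import math


def derived_parameters(m_hat, sigma_hat):
    """Map (m_hat, sigma_hat) to (rho_hat, alpha_hat, snr, snr_db).

    rho_hat and alpha_hat follow Eqs. (16) and (17); snr uses SNR = rho / 2.
    """
    if sigma_hat <= 0.0:
        raise ValueError("sigma_hat must be positive (singularity at sigma = 0)")
    rho = m_hat ** 2 / sigma_hat ** 2
    alpha = m_hat / sigma_hat ** 2
    snr = rho / 2.0
    snr_db = 10.0 * math.log10(snr) if snr > 0.0 else float("-inf")
    return rho, alpha, snr, snr_db


# Hypothetical values: m_hat = 2, sigma_hat = 1 gives rho = 4, alpha = 2, SNR = 2.
rho, alpha, snr, snr_db = derived_parameters(2.0, 1.0)
```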


III. The Log-Likelihood Function 

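Let us denote a set of $N$ measurements $(x_1, x_2, \ldots, x_N)$ by the
vector $x$. The probability density function of $x$ conditioned on
$m_{true} = m$ and $\sigma_{true} = \sigma$ is

$$ p(x|m, \sigma) = \prod_{k=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \cdot \frac{1}{2} \left[ \exp\left( \frac{-(x_k - m)^2}{2\sigma^2} \right) + \exp\left( \frac{-(x_k + m)^2}{2\sigma^2} \right) \right] \tag{18} $$

which after a little algebra becomes

$$ p(x|m, \sigma) = \prod_{k=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( \frac{-m^2}{2\sigma^2} \right) \exp\left( \frac{-x_k^2}{2\sigma^2} \right) \cosh\left( \frac{m x_k}{\sigma^2} \right) \tag{19} $$

Taking the natural logarithm of both sides of Eq. (19) gives the
log-likelihood function:

$$ \ln p(x|m, \sigma) = -N \ln \sqrt{2\pi} - N \ln \sigma - \frac{1}{2\sigma^2} \sum_{k=1}^{N} x_k^2 - \frac{N m^2}{2\sigma^2} + \sum_{k=1}^{N} \ln \cosh\left( \frac{m x_k}{\sigma^2} \right) \tag{20} $$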
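The log-likelihood of Eq. (20) can be evaluated numerically as sketched
below. The function names are hypothetical, and the overflow-safe form of
$\ln\cosh$ is an implementation choice rather than part of the derivation. A
direct evaluation of Eq. (18) is included only to confirm that the two forms
agree.

```python
import math


def log_cosh(u):
    """ln cosh(u), written as |u| + ln(1 + exp(-2|u|)) - ln 2 to avoid overflow."""
    u = abs(u)
    return u + math.log1p(math.exp(-2.0 * u)) - math.log(2.0)


def log_likelihood(x, m, sigma):
    """ln p(x | m, sigma) from Eq. (20); requires sigma > 0."""
    n = len(x)
    s2 = sigma * sigma
    sum_sq = sum(xk * xk for xk in x)
    return (-n * 0.5 * math.log(2.0 * math.pi)
            - n * math.log(sigma)
            - sum_sq / (2.0 * s2)
            - n * m * m / (2.0 * s2)
            + sum(log_cosh(m * xk / s2) for xk in x))


def log_likelihood_direct(x, m, sigma):
    """Same quantity computed term by term from the mixture density of Eq. (18)."""
    s2 = sigma * sigma
    total = 0.0
    for xk in x:
        mix = 0.5 * (math.exp(-(xk - m) ** 2 / (2.0 * s2))
                     + math.exp(-(xk + m) ** 2 / (2.0 * s2)))
        total += math.log(mix / math.sqrt(2.0 * math.pi * s2))
    return total


# Hypothetical measurements, used only to compare the two forms.
sample = [0.3, -1.2, 2.0]
value_20 = log_likelihood(sample, 1.0, 0.8)
value_18 = log_likelihood_direct(sample, 1.0, 0.8)
```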


IV. The Set of Feasible Solutions 

Consider the $\sigma$-$m$ plane where $\sigma$ is the abscissa and $m$ is the
ordinate. The joint maximum likelihood estimate (MLE) of $\sigma_{true}$ and
$m_{true}$ is the ordered pair $(\hat{\sigma}, \hat{m})$ in the $\sigma$-$m$
plane where $\ln p(x|m, \sigma)$ obtains its maximum. Let us define a set of
feasible solutions to the MLE problem as a set of ordered pairs in the
$\sigma$-$m$ plane of which the MLE $(\hat{\sigma}, \hat{m})$ is a member. We
wish to find a set of feasible solutions that is as small as possible. Since
$\sigma_{true}$ and $m_{true}$ are non-negative, we can restrict the set of
feasible solutions to lie in the first quadrant, including the non-negative
$\sigma$ and $m$ axes.

A necessary condition for a function to obtain its maximum at some point in
the interior of a closed, bounded region is that its partial derivatives at
that point are zero. Although the first quadrant of the $\sigma$-$m$ plane is
not bounded, one can observe from Eq. (18) that for finite $\{x_k\}$,
$p(x|m, \sigma)$ approaches zero for large $\sigma$ and $m$. Thus, the
maximum of $\ln p(x|m, \sigma)$ must be contained in some bounded region.
Therefore, we include in our set of feasible solutions those points in the
first quadrant (excluding the non-negative axes) at which both partial
derivatives of $\ln p(x|m, \sigma)$ with respect to $\sigma$ and $m$ vanish.

We must separately consider whether the maximum might occur on the
non-negative axes. Thus, a set of feasible solutions consists of those points
in the first quadrant of the $\sigma$-$m$ plane where both partial
derivatives of $\ln p(x|m, \sigma)$ vanish, and those points on the
non-negative axes where $\ln p(x|m, \sigma)$ obtains a local maximum. Let us
first consider the latter.

A. m-Axis Solutions 

In the limit as $\sigma \to 0$ ($m$-axis), we see from Eq. (18) that
$p(x|m, \sigma)$ is proportional to the product of delta functions given
below:

$$ \lim_{\sigma \to 0} p(x|m, \sigma) \propto \prod_{k=1}^{N} \{ \delta(x_k - m) + \delta(x_k + m) \} \tag{21} $$

In this case, one can see that if there exists some constant $c$ such that
$|x_k| = c$ for all $k$, then $p(x|m, \sigma)$ is zero everywhere on the
$m$-axis except at $m = c$, where it is unbounded. Conversely, if the
$\{x_k\}$ are not all equal in magnitude, then $p(x|m, \sigma)$ is zero on
the entire $m$-axis. Thus, since $p(x|m, \sigma)$ is bounded everywhere
except possibly the $m$-axis, we can state that

$$ (\hat{\sigma}, \hat{m}) = (0, c) \tag{22} $$

if and only if there exists a $c$ such that $|x_k| = c$ for all $k$.

B. σ-Axis Solution

For $m = 0$ ($\sigma$-axis), we have from Eq. (18) that

$$ p(x|m = 0, \sigma) = \prod_{k=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( \frac{-x_k^2}{2\sigma^2} \right) \tag{23} $$

This is just a unimodal gaussian density function with mean zero. It is well
known (Ref. 4) that this function obtains its maximum over $\sigma$ at

$$ \sigma = \sqrt{ \frac{1}{N} \sum_{k=1}^{N} x_k^2 } $$

Thus, the only point on the $\sigma$-axis that we need to include in the set
of feasible solutions is

$$ \left( \sqrt{ \frac{1}{N} \sum_{k=1}^{N} x_k^2 }, \ 0 \right) \tag{24} $$

C. Interior First Quadrant Solutions 

The other members of the set of feasible solutions are the points in the
first quadrant of the $\sigma$-$m$ plane (excluding the non-negative axes)
where the partial derivatives of $\ln p(x|m, \sigma)$ with respect to
$\sigma$ and $m$ vanish. Thus, we must find those ordered pairs
$(\sigma, m)$ for which both $\sigma$ and $m$ are positive and simultaneously
satisfy

$$ \frac{\partial}{\partial \sigma} \ln p(x|m, \sigma) = 0 \tag{25} $$

and

$$ \frac{\partial}{\partial m} \ln p(x|m, \sigma) = 0 \tag{26} $$

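Performing the indicated partial derivatives on Eq. (20) leads to

$$ \frac{\partial}{\partial \sigma} \ln p(x|m, \sigma) = -\frac{N}{\sigma} + \frac{N m^2}{\sigma^3} + \frac{1}{\sigma^3} \sum_{k=1}^{N} x_k^2 - \frac{2m}{\sigma^3} \sum_{k=1}^{N} x_k \tanh\left( \frac{m x_k}{\sigma^2} \right) \tag{27} $$

$$ \frac{\partial}{\partial m} \ln p(x|m, \sigma) = -\frac{N m}{\sigma^2} + \frac{1}{\sigma^2} \sum_{k=1}^{N} x_k \tanh\left( \frac{m x_k}{\sigma^2} \right) \tag{28} $$

Setting Eq. (28) equal to zero leads to the relation between $m$ and
$\sigma$:

$$ m = \frac{1}{N} \sum_{k=1}^{N} x_k \tanh\left( \frac{m x_k}{\sigma^2} \right) \tag{29} $$

Using Eq. (29), we can simplify the rightmost term in Eq. (27), since the sum
there equals $N m$. Consequently, setting Eq. (27) equal to zero leads to the
second relation between $m$ and $\sigma$:

$$ \sigma^2 + m^2 = \frac{1}{N} \sum_{k=1}^{N} x_k^2 \tag{30} $$

For simplicity of notation, let us make the definition

$$ \langle x^2 \rangle_N = \frac{1}{N} \sum_{k=1}^{N} x_k^2 \tag{31} $$

For now, since we are only considering positive $\sigma$ and $m$, we see from
Eq. (30) that the feasible solutions $(\sigma, m)$ in the first quadrant
(excluding the non-negative axes) must satisfy
$0 < \sigma < \sqrt{\langle x^2 \rangle_N}$ and
$0 < m < \sqrt{\langle x^2 \rangle_N}$. Using Eq. (30) to solve for $\sigma$
in terms of $m$ and substituting into Eq. (29), we obtain a transcendental
equation in one unknown:

$$ m = \frac{1}{N} \sum_{k=1}^{N} x_k \tanh\left( \frac{m x_k}{\langle x^2 \rangle_N - m^2} \right), \quad 0 < m < \sqrt{\langle x^2 \rangle_N} \tag{32} $$

Thus, given the measurements $x_k$, $k = 1, 2, \ldots, N$, a set of feasible
solutions in the interior first quadrant consists of ordered pairs of the
form $(\sqrt{\langle x^2 \rangle_N - m^2}, \ m)$, where $m$ satisfies
Eq. (32). Equivalently, $m$ is one of the roots of the function
$F(m, x) = m - f(m, x)$, where

$$ f(m, x) = \frac{1}{N} \sum_{k=1}^{N} x_k \tanh\left( \frac{m x_k}{\langle x^2 \rangle_N - m^2} \right) \tag{33} $$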
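The functions $f(m, x)$ and $F(m, x) = m - f(m, x)$ of Eq. (33), together
with the noise amplitude implied by Eq. (30), can be sketched as follows.
The function names are hypothetical. At the end point
$m = \sqrt{\langle x^2 \rangle_N}$ the argument of the hyperbolic tangent is
unbounded, so the sketch returns the limiting value, the mean of $|x_k|$,
there; treating any non-positive denominator this way is an implementation
assumption.

```python
import math


def mean_square(x):
    """Sample second moment <x^2>_N of Eq. (31)."""
    return sum(xk * xk for xk in x) / len(x)


def f(m, x):
    """f(m, x) of Eq. (33)."""
    denom = mean_square(x) - m * m
    if denom <= 0.0:
        # At (or numerically past) the end point, tanh saturates to sign(x_k),
        # so each term x_k * tanh(...) becomes |x_k|.
        return sum(abs(xk) for xk in x) / len(x)
    return sum(xk * math.tanh(m * xk / denom) for xk in x) / len(x)


def F(m, x):
    """F(m, x) = m - f(m, x); its roots give the feasible values of m."""
    return m - f(m, x)


def sigma_from_m(m, x):
    """Noise amplitude paired with m through Eq. (30)."""
    return math.sqrt(max(mean_square(x) - m * m, 0.0))


# Hypothetical data with all magnitudes equal to 1, so <x^2>_N = 1.
data = [1.0, -1.0, 1.0]
f_at_zero = f(0.0, data)
f_at_end = f(1.0, data)
sigma_example = sigma_from_m(0.6, data)
```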


V. Finding the Roots of F(m,x) 

Rather than finding the roots of $F(m, x)$ in the range
$0 < m < \sqrt{\langle x^2 \rangle_N}$, let us extend this range to
$0 \le m \le \sqrt{\langle x^2 \rangle_N}$. At first, one may think that we
have needlessly increased the size of the set of feasible solutions defined
in the last section. However, we will see in this section that finding the
roots of $F(m, x)$ in this new range of $m$ includes the feasible solutions
on the $\sigma$ and $m$ axes.

Finding the roots of $F(m, x)$ can be tricky. For example, looking for roots
by investigating when $F(m, x)$ changes sign may fail since $F(m, x)$ may
have two or more roots very close together, or may in fact not change sign at
a root. However, insight can be gained by observing that the roots of
$F(m, x)$ are just the intersections of the curves

$$ z = m \tag{34} $$

and

$$ z = f(m, x) \tag{35} $$

where $m$ is restricted to $0 \le m \le \sqrt{\langle x^2 \rangle_N}$.

It is interesting to note that $f(m, x)$ is an even function in $x_k$. That
is, $f(m, x)$ depends on each $x_k$ via its absolute value. This is not
surprising since one can see that the conditional probability density
function $p(x|m, \sigma)$ in Eq. (18) depends only on $|x_k|$. Thus, the
absolute values $\{|x_k|\}$ constitute a sufficient statistic and the sign
bit of $x_k$ is not needed. Important properties of $f(m, x)$ are listed in
Table 1, where for notational convenience we have made the definitions:

$$ \langle |x| \rangle_N = \frac{1}{N} \sum_{k=1}^{N} |x_k| \tag{36} $$

$$ \langle x^4 \rangle_N = \frac{1}{N} \sum_{k=1}^{N} x_k^4 \tag{37} $$

We shall now show that the roots of $F(m, x)$ in the range
$0 \le m \le \sqrt{\langle x^2 \rangle_N}$ include the feasible solutions on
the $m$ and $\sigma$ axes, as given in Eqs. (22) and (24). First, we verify
Eq. (24), which specifies the feasible solution on the $\sigma$-axis. We see
from Table 1, property (i), that $m = 0$ is always a root of $F(m, x)$ for
all $x$. But when $m = 0$, we have from Eq. (30) that
$\sigma = \sqrt{\langle x^2 \rangle_N}$. Thus, the feasible solution
$(\sqrt{\langle x^2 \rangle_N}, 0)$ on the $\sigma$-axis can be obtained from
finding the roots of $F(m, x)$ in the range
$0 \le m \le \sqrt{\langle x^2 \rangle_N}$.

Next, we show that finding the roots of $F(m, x)$ within the range
$0 \le m \le \sqrt{\langle x^2 \rangle_N}$ also specifies the feasible
solution on the $m$-axis as given by Eq. (22). It is not too difficult to see
from Eq. (22) that $(0, \sqrt{\langle x^2 \rangle_N})$ is the joint MLE if
and only if there exists a $c$ such that $|x_k| = c$ for all $k$. However, it
is easily verified that $|x_k| = c$ for all $k$ implies

$$ \sqrt{\langle x^2 \rangle_N} = \langle |x| \rangle_N \tag{38} $$

in which case we have, using property (ii) of Table 1:

$$ f(\sqrt{\langle x^2 \rangle_N}, x) = \langle |x| \rangle_N = \sqrt{\langle x^2 \rangle_N} \tag{39} $$

Thus, $m = \sqrt{\langle x^2 \rangle_N}$ is a root of $F(m, x)$ whenever
$|x_k| = c$ for all $k$. Furthermore, for
$m = \sqrt{\langle x^2 \rangle_N}$, we have from Eq. (30) that $\sigma = 0$.
It thus follows that whenever the ordered pair
$(0, \sqrt{\langle x^2 \rangle_N})$ is the joint MLE, it can always be
obtained by looking for the roots of $F(m, x)$ in the range
$0 \le m \le \sqrt{\langle x^2 \rangle_N}$.

Having justified extending the search for the roots of $F(m, x)$ to the range
$0 \le m \le \sqrt{\langle x^2 \rangle_N}$, let us state what we currently
know regarding the roots within this range. As mentioned before, $m = 0$ is
always a root of $F(m, x)$. Are there any nonzero roots? To answer this
question, we first note that the curve $z = f(m, x)$ is not above the curve
$z = m$ at $m = \sqrt{\langle x^2 \rangle_N}$. This is easily verified by
invoking Jensen's inequality

$$ \langle |x| \rangle_N \le \sqrt{\langle x^2 \rangle_N} \tag{40} $$

and using property (ii) of Table 1 to yield

$$ f(\sqrt{\langle x^2 \rangle_N}, x) = \langle |x| \rangle_N \le \sqrt{\langle x^2 \rangle_N} \tag{41} $$

Next, we observe that from property (vi) of Table 1, the curve
$z = f(m, x)$ rises above the curve $z = m$ sufficiently near $m = 0$ if and
only if the following critical condition is satisfied:

$$ \langle x^4 \rangle_N < 3 \langle x^2 \rangle_N^2 \tag{42} $$

Thus, if Eq. (42) is satisfied, the curve $z = f(m, x)$ must intersect the
curve $z = m$ for some nonzero $m$ less than or equal to
$\sqrt{\langle x^2 \rangle_N}$.

The condition in Eq. (42) is interesting because it parallels an easily
verifiable relationship between the corresponding ensemble averages, namely,
$E\{x^4\} < 3 E\{x^2\}^2$ for $m_{true} > 0$, and
$E\{x^4\} = 3 E\{x^2\}^2$ for $m_{true} = 0$. Thus, a nonzero root of
$F(m, x)$ is guaranteed whenever the sample moments
$\langle x^4 \rangle_N$, $\langle x^2 \rangle_N$ bear the same relationship
as that relationship between ensemble moments which distinguishes the
nonzero-mean case from the zero-mean case.
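The ensemble relationship quoted above can be verified in a few lines. The
following worked derivation is a sketch under the model of Eq. (7), with $a$
taking the values $\pm 1$ with equal probability, $b$ a zero-mean,
unit-variance gaussian variable independent of $a$, and $m$ and $\sigma$
standing for $m_{true}$ and $\sigma_{true}$.

```latex
\begin{align*}
x &= m a + \sigma b, \qquad
  E\{a^2\} = E\{a^4\} = 1, \quad E\{b^2\} = 1, \quad E\{b^4\} = 3, \quad
  E\{a\} = E\{a^3\} = E\{b\} = E\{b^3\} = 0 \\
E\{x^2\} &= m^2 + \sigma^2 \\
E\{x^4\} &= m^4 E\{a^4\} + 6 m^2 \sigma^2 E\{a^2\} E\{b^2\} + \sigma^4 E\{b^4\}
          = m^4 + 6 m^2 \sigma^2 + 3 \sigma^4 \\
3 E\{x^2\}^2 &= 3 m^4 + 6 m^2 \sigma^2 + 3 \sigma^4 \\
E\{x^4\} - 3 E\{x^2\}^2 &= -2 m^4 \le 0,
  \quad \text{with equality if and only if } m = 0
\end{align*}
```

The odd-order cross terms vanish because the odd moments of $a$ and $b$ are
zero.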

Finally, we state one more property that is known concerning the roots of
$F(m, x)$. As mentioned before, $|x_k| = c$ for all $k$ implies that
$m = \sqrt{\langle x^2 \rangle_N}$ is a root of $F(m, x)$. The converse is
also true. If $\sqrt{\langle x^2 \rangle_N}$ is a root of $F(m, x)$, then

$$ f(\sqrt{\langle x^2 \rangle_N}, x) = \sqrt{\langle x^2 \rangle_N} \tag{43} $$

and thus from property (ii) we have

$$ \langle |x| \rangle_N = \sqrt{\langle x^2 \rangle_N} \tag{44} $$

which implies $|x_k| = c$ for all $k$, because this is the only condition
under which Jensen's inequality, Eq. (40), can be satisfied with equality.

We summarize our results in this section in the following theorem:

Theorem 1

(1) The feasible solutions are of the form
$(\sqrt{\langle x^2 \rangle_N - m^2}, \ m)$, where $m$ is a root of
$F(m, x)$ in the range $[0, \sqrt{\langle x^2 \rangle_N}]$.

(2) (a) $F(m, x)$ always has the root $m = 0$ for all $x$.

(b) $F(m, x)$ has the root $m = \sqrt{\langle x^2 \rangle_N}$ if and only if
$|x_k| = \sqrt{\langle x^2 \rangle_N}$ for all $k$.

(c) If $\langle x^4 \rangle_N < 3 \langle x^2 \rangle_N^2$, then there exists
a nonzero root of $F(m, x)$ less than or equal to
$\sqrt{\langle x^2 \rangle_N}$.

In Fig. 2, we have sketched $z = m$ and a hypothetical $z = f(m, x)$
satisfying Eq. (42). For the sake of simplicity, we have drawn the curves so
that there are only two intersections.

Unfortunately, Theorem 1 is all that we know regarding $F(m, x)$. Several
pertinent questions are: If Eq. (42) is satisfied, is there only one nonzero
root? If Eq. (42) is not satisfied, are there any nonzero roots? And,
finally, when there is more than one root, which one corresponds to the MLE?
These questions have been very difficult to answer analytically. It would be
“nice” if there were only one nonzero root when Eq. (42) is satisfied, and no
nonzero roots otherwise. A few plots of $F(m, x)$ indicate that this might be
so. Properties of $F(m, x)$ which might give some indication about the number
of roots are currently being investigated. We are also in the process of
looking for counterexamples.

VI. An Algorithm for an Upper Bound of the MLE

Although Theorem 1 is somewhat incomplete concerning the number of roots of
$F(m, x)$, we can nevertheless give an algorithm for finding the largest
root, which provides an upper bound to our signal-to-noise ratio and combiner
weight estimators. We suspect that this upper bound is indeed the MLE and we
show later that this is true in the large SNR case.

Let $m^*$ denote the largest root of $F(m, x)$, where
$0 \le m^* \le \sqrt{\langle x^2 \rangle_N}$. A graphical representation of
the algorithm for finding $m^*$ is given in Fig. 3. At the $i$th iteration,
$m^{(i)}$ is some estimate of $m^*$, where $m^{(i)} \ge m^*$. As the figure
indicates, the better estimate $m^{(i+1)}$ is obtained by following the paths
labeled (1), (2), and (3). One can see that the estimate $m^{(i+1)}$ is
closer to $m^*$ than the previous estimate $m^{(i)}$, but is still larger
than $m^*$. From Fig. 3, we see that $m^{(i+1)}$ is given simply by the
recursion

$$ m^{(i+1)} = f(m^{(i)}, x) \tag{45} $$

The zeroth estimate of $m^*$ is

$$ m^{(0)} = \sqrt{\langle x^2 \rangle_N} $$

Thus, performing Eq. (45) for $i = 0, 1, \ldots$, generates the sequence of
estimates $m^{(0)} \ge m^{(1)} \ge m^{(2)} \ge \ldots$. It can be shown that
this sequence converges to $m^*$. Since $m^*$ is the largest root of
$F(m, x)$, we see that upper bounds to our estimators are

$$ \hat{\rho} \le \lim_{i \to \infty} \frac{[m^{(i)}]^2}{\langle x^2 \rangle_N - [m^{(i)}]^2} \tag{46} $$

$$ \hat{\alpha} = \frac{\hat{m}}{\hat{\sigma}^2} \le \lim_{i \to \infty} \frac{m^{(i)}}{\langle x^2 \rangle_N - [m^{(i)}]^2} \tag{47} $$

To obtain a qualitative understanding of how the rate of convergence of such
an algorithm depends on SNR, let us make the definition:

$$ \Delta^{(i)} = m^{(i)} - m^* \tag{48} $$

By the mean value theorem, there exists some $m_0$ between $m^*$ and
$m^{(i)}$ such that

$$ \Delta^{(i+1)} = f(m^{(i)}, x) - f(m^*, x) = \left. \frac{\partial f}{\partial m} \right|_{m = m_0} \Delta^{(i)} \tag{49} $$

Equation (49) gives us some idea about the rate of convergence of the
algorithm. For example, in the case of high SNR, $m^*$ should be close to
$\sqrt{\langle x^2 \rangle_N}$ and in this case the partial derivative of
$f(m, x)$ should be close to zero. Thus, one can see from Eq. (49) that
$\Delta^{(i)}$ would approach zero very rapidly. On the other hand, for low
SNR, one would expect that $m^*$ would be closer to zero, and consequently
the partial derivative of $f(m, x)$ would be closer to its derivative at the
origin, which is one. In this case the convergence would be very slow.

An algorithm for finding a root close to $m^*$ is given by the flow diagram
in Fig. 4. The variable TOL is a pre-assigned tolerance for the difference
between two successive estimates of $m^*$. Also, a limit to the number of
iterations of the algorithm is set by the variable NUM. This is needed in the
low SNR case where the convergence of the estimate to $m^*$ may be
asymptotically slow.
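The recursion of Eq. (45), with a tolerance and an iteration limit playing
the roles of TOL and NUM, might be sketched as below. This is not a
transcription of the flow diagram of Fig. 4, which is not reproduced here;
the function name, the default values of `tol` and `num`, the random seed,
and the simulated parameter values are all assumptions made for illustration.

```python
import math
import random


def largest_root(x, tol=1e-6, num=1000):
    """Iterate m <- f(m, x) downward from sqrt(<x^2>_N), as in Eq. (45).

    tol and num correspond to TOL and NUM; their defaults are arbitrary.
    Returns (m, sigma), with sigma obtained from Eq. (30).
    """
    n = len(x)
    mean_sq = sum(xk * xk for xk in x) / n
    # First iterate: f at m(0) = sqrt(<x^2>_N) equals <|x|>_N, property (ii).
    m = sum(abs(xk) for xk in x) / n
    for _ in range(num):
        denom = mean_sq - m * m
        if denom <= 0.0:
            # All |x_k| equal: sqrt(<x^2>_N) is itself a root and sigma = 0.
            break
        m_next = sum(xk * math.tanh(m * xk / denom) for xk in x) / n
        done = abs(m - m_next) < tol
        m = m_next
        if done:
            break
    sigma = math.sqrt(max(mean_sq - m * m, 0.0))
    return m, sigma


# Hypothetical experiment: 5000 samples of Eq. (7) with m = 2 and sigma = 1.
rng = random.Random(1)
m_true, sigma_true = 2.0, 1.0
x = [m_true * rng.choice((-1, 1)) + sigma_true * rng.gauss(0.0, 1.0)
     for _ in range(5000)]
m_hat, sigma_hat = largest_root(x)
rho_hat = m_hat ** 2 / sigma_hat ** 2      # upper bound of Eq. (46)
alpha_hat = m_hat / sigma_hat ** 2         # upper bound of Eq. (47)
```

At low SNR, and in particular when the critical condition of Eq. (42) is not
met, the iteration may creep toward $m = 0$ and stop on the iteration limit
rather than on the tolerance, so the returned value should then be treated
with caution.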


VII. The High SNR Case

It is interesting to consider the high SNR case, especially since this serves
as a check on our method. In the high SNR case, the $\{x_k\}$ will most
likely be nearly equal in magnitude. Then, from Jensen's inequality,
Eq. (40), $\langle |x| \rangle_N$ would be close to, but less than,
$\sqrt{\langle x^2 \rangle_N}$. Thus, from Fig. 3, we expect the intersection
of the curves $z = m$ and $z = f(m, x)$ to be close to
$\sqrt{\langle x^2 \rangle_N}$, in which case just one iteration of the
algorithm of Fig. 4 would yield a close estimate of $m^*$, given below.

$$ m^* \approx m^{(1)} = f(\sqrt{\langle x^2 \rangle_N}, x) = \langle |x| \rangle_N, \quad \text{for SNR} \to \infty \tag{50} $$

If we use the above for $m^*$, then the signal-to-noise ratio estimate is

$$ \hat{\rho} \approx \frac{\langle |x| \rangle_N^2}{\langle x^2 \rangle_N - \langle |x| \rangle_N^2} \tag{51} $$

which asymptotically equals the usual signal-to-noise ratio estimate for the
high SNR case (Ref. 5).
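The one-iteration estimate of Eqs. (50) and (51) needs only the first
absolute moment and the second moment of the samples. A minimal sketch
follows; the function name and the two-sample example are hypothetical, and
returning infinity when all magnitudes are equal is an implementation choice
for the singular case $\hat{\sigma} = 0$.

```python
def high_snr_rho(x):
    """Signal-to-noise ratio parameter from Eq. (51): <|x|>^2 / (<x^2> - <|x|>^2)."""
    n = len(x)
    mean_abs = sum(abs(xk) for xk in x) / n
    mean_sq = sum(xk * xk for xk in x) / n
    spread = mean_sq - mean_abs ** 2
    if spread <= 0.0:
        # All |x_k| equal: the estimated noise amplitude is zero.
        return float("inf")
    return mean_abs ** 2 / spread


# Hypothetical two-sample example: <|x|> = 2 and <x^2> = 5 give rho = 4 / 1 = 4.
rho_example = high_snr_rho([1.0, -3.0])
```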


VIII. Summary 

The main result of this memo is an algorithm for finding upper bounds to the
maximum likelihood estimates of signal-to-noise ratio and combiner weights.
Further work is needed to determine if these upper bounds equal the maximum
likelihood estimates.
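Table 1. Properties of f(m, x)

(i)    $f(0, x) = 0$

(ii)   $\lim_{m \to \sqrt{\langle x^2 \rangle_N}} f(m, x) = \langle |x| \rangle_N$

(iii)  $\left. \partial f / \partial m \right|_{m = 0} = 1$

(iv)   $\left. \partial f / \partial m \right|_{m \to \sqrt{\langle x^2 \rangle_N}} = 0$

(v)    $f(m, x)$ is monotonically increasing in $m$

(vi)   To third order in $m$:
       $f(m, x) \approx m + \frac{m^3}{\langle x^2 \rangle_N} \left[ 1 - \frac{\langle x^4 \rangle_N}{3 \langle x^2 \rangle_N^2} \right]$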
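Property (vi) can be checked by a short series expansion of Eq. (33). The
following is a reconstruction offered as a sketch, writing $S$ for
$\langle x^2 \rangle_N$ and keeping terms through $m^3$.

```latex
\begin{align*}
\frac{m x_k}{S - m^2} &= \frac{m x_k}{S} \left( 1 + \frac{m^2}{S} + O(m^4) \right),
  \qquad \tanh u = u - \frac{u^3}{3} + O(u^5) \\
x_k \tanh\!\left( \frac{m x_k}{S - m^2} \right)
  &= \frac{m x_k^2}{S} + \frac{m^3 x_k^2}{S^2} - \frac{m^3 x_k^4}{3 S^3} + O(m^5) \\
f(m, x) &= \frac{1}{N} \sum_{k=1}^{N} x_k \tanh\!\left( \frac{m x_k}{S - m^2} \right)
  = m + \frac{m^3}{S} \left[ 1 - \frac{\langle x^4 \rangle_N}{3 S^2} \right] + O(m^5)
\end{align*}
```

The cubic coefficient is positive exactly when
$\langle x^4 \rangle_N < 3 \langle x^2 \rangle_N^2$, which is the critical
condition of Eq. (42).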